ABSTRACT



This invention relates to a text-to-speech conversion system
(TTS) for interlocking with multimedia and a method for
organizing input data of the same. A conventional TTS is limited
to synthesizing speech from input text. In addition, in the prior
art, the information required to dub a moving picture by use of
TTS, or to interlock the synthesized speech naturally with
multimedia such as animation, cannot be presumed from the text
alone, so there is no method to realize these functions.
Furthermore, there are no results of studies on the use of
additional data for enhancing the naturalness of the synthesized
speech, or on the organization of such data. Therefore, an object
of the present invention is to provide a text-to-speech
conversion system (TTS) for interlocking with multimedia and a
method for organizing input data of the same, which enhance the
naturalness of synthesized speech and accomplish the
synchronization of multimedia with TTS by defining additional
prosody information, the information required to interlock TTS
with multimedia, and an interface between this information and
TTS for use in the production of the synthesized speech. According
to the present invention, a foreign movie can be dubbed in Korean
by synchronizing the synthesized speech with the moving picture
through the direct use of text information and lip-shape
information which is [...] analysis of actual speech data and
lip-shape in the moving picture for the production of the
synthesized speech. Still furthermore, the present invention is
applicable to a variety of fields such as communication service,
office automation, education and so on by making the
synchronization between the picture information and the TTS in
the multimedia environment possible.

conversion system (TTS) for interlocking with multimedia,
comprising the steps of:

classifying multimedia input information, organized for
enhancing the naturalness of synthesized speech and implementing
the synchronization of multimedia with TTS, into text, prosody,
information on synchronization with a moving picture, lip-shape,
and individual property information in a multimedia information
input unit;

distributing the information classified in the multimedia
information input unit in a data distributor by each media, based
on the respective information;

converting the text distributed in the data distributor by each
media into a phoneme stream, presuming prosody information and
symbolizing the information in a language processor;

calculating a value of each prosody control parameter other than
the prosody control parameters included in the multimedia
information in a prosody processor;

adjusting the duration of each phoneme in a synchronization
adjuster so that the processing result of the prosody processor
may be synchronized with a picture signal according to input of
the synchronization information;

producing the synchronized speech in a signal processor using the
prosody information from the data distributor by each media, the
processing result of the synchronization adjuster, and a
synthesis unit database; and

outputting the picture information distributed by the data
distributor by each media onto a screen in a picture output
apparatus.
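The classifying and distributing steps of this claim can be illustrated with a minimal Python sketch. The key names in CATEGORIES and the routing table ROUTES are hypothetical; the claim names the five categories and the processing units but does not prescribe field names or a complete routing, so the mapping below is an assumption made for illustration only.

```python
from typing import Any

# Hypothetical keys for the five categories named in the claim.
CATEGORIES = ("text", "prosody", "sync", "lip_shape", "individual")

# Assumed routing of each category to a downstream unit (illustrative only).
ROUTES = {
    "text": "language_processor",
    "prosody": "signal_processor",
    "sync": "synchronization_adjuster",
    "lip_shape": "synchronization_adjuster",
    "individual": "prosody_processor",
}


def classify(raw: dict[str, Any]) -> dict[str, Any]:
    """Keep only the recognized categories; absent ones become None."""
    return {name: raw.get(name) for name in CATEGORIES}


def distribute(classified: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Group every category that is present by its destination unit."""
    routed: dict[str, dict[str, Any]] = {}
    for name, value in classified.items():
        if value is not None:
            routed.setdefault(ROUTES[name], {})[name] = value
    return routed
```

In this sketch, unrecognized keys in the input are dropped by classify, and distribute only forwards categories that were actually supplied.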



3. The method according to claim 2, wherein said organized
multimedia information is comprised of text information, prosody
information, information on synchronization with a moving
picture, lip-shape and individuality information.



4. The method according to claim 3, wherein said prosody
information is comprised of the number of phonemes, phoneme
stream information, the duration of each phoneme, the pitch
pattern of the phoneme and the energy pattern of the phoneme.



5. The method according to claim 4, wherein said pitch pattern of
the phoneme is indicative of a value of pitch at a beginning
point, a middle point, and an end point within the phoneme.



6. The method according to claim 4, wherein said energy pattern
of the phoneme is indicative of a value of energy in decibels at
a beginning point, a middle point and an end point within the
phoneme.
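The prosody information described in claims 4 to 6 can be pictured as a small record type. The following is a minimal sketch, not the format defined by the invention: the class and field names are hypothetical, and the units for pitch (Hz) and duration (ms) are assumptions, since the claims specify a unit only for energy (decibels).

```python
from dataclasses import dataclass


@dataclass
class ThreePoint:
    """A value sampled at the beginning, middle and end of a phoneme."""
    begin: float
    mid: float
    end: float


@dataclass
class PhonemeProsody:
    symbol: str            # one entry of the phoneme stream
    duration_ms: float     # duration of the phoneme (unit assumed)
    pitch_hz: ThreePoint   # pitch pattern (unit assumed)
    energy_db: ThreePoint  # energy pattern in decibels, as in claim 6


@dataclass
class ProsodyInfo:
    phonemes: list[PhonemeProsody]

    @property
    def phoneme_count(self) -> int:
        """The number of phonemes."""
        return len(self.phonemes)

    @property
    def phoneme_stream(self) -> list[str]:
        """The phoneme stream, in order."""
        return [p.symbol for p in self.phonemes]

    def total_duration_ms(self) -> float:
        """Sum of the durations of all phonemes."""
        return sum(p.duration_ms for p in self.phonemes)
```

Deriving the phoneme count from the list, rather than storing it separately, is a design choice of this sketch that keeps the two from disagreeing.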



7. The method according to claim 2, wherein said synchronization
information is comprised of text, lip-shape, location information
within the moving picture, and duration information.



8. The method according to claim 2, wherein said synchronization
information is composed of a beginning point, a duration and
delay time information of the starting point, and the duration of
each phoneme is controlled by said synchronization information.
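One simple way to control phoneme durations from the synchronization information is to scale them so that they fill the target duration. The sketch below assumes uniform scaling, which the claim does not specify; the function name, the parameter names and the millisecond unit are likewise hypothetical.

```python
def fit_durations(durations_ms, start_ms, delay_ms, target_ms):
    """Return (onset_ms, scaled_durations).

    onset_ms is the beginning point shifted by the delay time, and the
    scaled durations sum to target_ms. Uniform scaling is an assumption.
    """
    total = sum(durations_ms)
    if total <= 0 or target_ms <= 0:
        raise ValueError("durations and target must be positive")
    scale = target_ms / total
    return start_ms + delay_ms, [d * scale for d in durations_ms]


# Three phonemes totalling 200 ms are stretched to fill a 300 ms segment
# that begins at 1000 ms with a 20 ms delay.
onset, scaled = fit_durations([100, 50, 50], start_ms=1000, delay_ms=20, target_ms=300)
```

A more realistic adjuster would probably stretch vowels more than consonants; uniform scaling is used here only to keep the example short.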



9. The method according to claim 2, wherein said synchronization
information is composed of a duration of the beginning point of a
sentence and duration information of the starting point, and the
duration of each phoneme is controlled by a forecast lip-shape
that takes into account the articulation manner and articulation
control of the phoneme, the lip-shape within the synchronization
information, and the duration information contained in said
synchronization information.



10. The method according to claim 2, wherein said synchronized
speech is produced from information on the beginning point and
end point of each phoneme in relation to the speech signal, and
information on the phoneme.



11. The method according to claim 2, wherein said synchronized
speech is produced by a numeralization of the distance (extent of
opening) between the upper lip and the lower lip, the distance
(extent of width) between the left and right end points of the
lip, and the extent of projection of the lip, and by the
lip-shape quantized and normalized into patterns that depend on
the articulation location and articulation manner of the phoneme,
on the basis of patterns with a high discriminative property.
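The numeralization of lip-shape in this claim can be illustrated by normalizing the three measurements and quantizing each to a small number of levels. This is a minimal sketch under stated assumptions: the function name, the per-speaker maxima used for normalization, and the default of four levels are all hypothetical and are not taken from the claim.

```python
def quantize_lip_shape(opening, width, protrusion,
                       max_opening, max_width, max_protrusion,
                       levels=4):
    """Normalize each lip measurement to [0, 1] and quantize it.

    Returns a tuple (opening_level, width_level, protrusion_level),
    each an integer in the range 0 .. levels - 1.
    """
    def quantize(value, maximum):
        # Clamp the normalized value so out-of-range inputs stay valid.
        ratio = min(max(value / maximum, 0.0), 1.0)
        # The top of the range maps to the highest level, not to `levels`.
        return min(int(ratio * levels), levels - 1)

    return (
        quantize(opening, max_opening),
        quantize(width, max_width),
        quantize(protrusion, max_protrusion),
    )


# A half-open, fairly wide, unprotruded lip shape.
pattern = quantize_lip_shape(10, 40, 0, max_opening=20, max_width=50, max_protrusion=10)
```

Each resulting tuple acts as a discrete lip-shape pattern that could then be associated with phonemes by articulation location and manner.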



12. The method according to claim 2, wherein said transmission
method of multimedia information comprises the steps of:

converting the prosody information existing in the multimedia
information into a data structure usable by the signal processor;

transmitting the converted prosody information to the prosody
processor and the synchronization adjuster;

converting the prosody information outputted from the prosody
processor and the synchronization adjuster into a data structure
usable by the synthesis unit database and the prosody processor
within the TTS if the prosody information is included in said
multimedia input information; and

transmitting them to the synthesis unit database and the prosody
processor if the individual property information is included in
said multimedia input information.